neural rendering
EPIC Fields: Marrying 3D Geometry and Video Understanding
Neural rendering is fuelling a unification of learning, 3D geometry and video understanding that has been waiting for more than two decades. Progress, however, is still hampered by a lack of suitable datasets and benchmarks. To address this gap, we introduce EPIC Fields, an augmentation of EPIC-KITCHENS with 3D camera information. Like other datasets for neural rendering, EPIC Fields removes the complex and expensive step of reconstructing cameras using photogrammetry, and allows researchers to focus on modelling problems. We illustrate the challenges of photogrammetry in egocentric videos of dynamic actions and propose innovations to address them. Compared to other neural rendering datasets, EPIC Fields is better tailored to video understanding because it is paired with labelled action segments and the recent VISOR segment annotations. To further motivate the community, we also evaluate two benchmark tasks, neural rendering and segmenting dynamic objects, with strong baselines that showcase what is and is not possible today. We also highlight the advantage of geometry in semi-supervised video object segmentation on the VISOR annotations.
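EPIC Fields is a dataset contribution, but the claimed advantage of geometry for segmentation is easy to make concrete: with per-frame camera poses (which the dataset supplies) and a depth estimate, a labelled pixel in one frame can be reprojected into another frame, propagating sparse labels essentially for free. A minimal sketch of that reprojection, with illustrative intrinsics and poses rather than anything from the benchmark:

```python
# Illustrative camera-geometry sketch, not EPIC Fields benchmark code.
import numpy as np

def reproject(pixel, depth, K, T_w_from_a, T_w_from_b):
    """Lift pixel (u, v) with known depth out of camera A, into camera B."""
    u, v = pixel
    p_a = depth * np.linalg.inv(K) @ np.array([u, v, 1.0])  # 3D point, cam A
    p_w = T_w_from_a @ np.append(p_a, 1.0)                  # to world coords
    p_b = np.linalg.inv(T_w_from_b) @ p_w                   # into cam B
    uvw = K @ p_b[:3]
    return uvw[:2] / uvw[2]                                 # pixel in frame B

K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])  # assumed intrinsics
T_a = np.eye(4)                       # frame A at the world origin
T_b = np.eye(4); T_b[0, 3] = 0.1      # frame B shifted 10 cm along x
print(reproject((320, 240), 2.0, K, T_a, T_b))
```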
DEL: Discrete Element Learner for Learning 3D Particle Dynamics with Neural Rendering
Learning-based simulators show great potential for simulating particle dynamics when 3D ground truth is available, but per-particle correspondences are not always accessible. The development of neural rendering offers this field a new solution: learning 3D dynamics from 2D images by inverse rendering. However, existing approaches still suffer from the ill-posed nature of the 2D-to-3D problem; for example, the same 2D images can correspond to many different 3D particle distributions. To mitigate this uncertainty, we take a conventional, mechanically interpretable framework as a physical prior and extend it to a learning-based version. In brief, we incorporate learnable graph kernels into the classic Discrete Element Analysis (DEA) framework to implement a novel mechanics-informed network architecture.
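The abstract describes the architecture only at a high level; a minimal sketch of the underlying idea, assuming a PyTorch setting and hypothetical module names (this is not the authors' DEL code), is to keep the discrete-element update loop but replace the analytic pairwise contact law with a learnable kernel:

```python
import torch
import torch.nn as nn

class PairKernel(nn.Module):
    """Learnable stand-in for an analytic DEA contact-force law."""
    def __init__(self, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(4, hidden), nn.ReLU(), nn.Linear(hidden, 3))

    def forward(self, rel_pos):                       # (E, 3) edge offsets
        dist = rel_pos.norm(dim=-1, keepdim=True)     # (E, 1) distances
        return self.mlp(torch.cat([rel_pos, dist], dim=-1))  # (E, 3) forces

def step(pos, vel, edges, kernel, dt=1e-3, mass=1.0):
    """One explicit DEA-style integration step with learned pair forces."""
    src, dst = edges                                  # neighbour index pairs
    f_pair = kernel(pos[dst] - pos[src])              # force on src from dst
    force = torch.zeros_like(pos).index_add_(0, src, f_pair)
    vel = vel + dt * force / mass                     # semi-implicit Euler
    return pos + dt * vel, vel
```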
The Radiance of Neural Fields: Democratizing Photorealistic and Dynamic Robotic Simulation
Nuthall, Georgina, Bowden, Richard, Mendez, Oscar
As robots increasingly coexist with humans, they must navigate complex, dynamic environments rich in visual information and implicit social dynamics, like when to yield or move through crowds. Addressing these challenges requires significant advances in vision-based sensing and a deeper understanding of socio-dynamic factors, particularly in tasks like navigation. To facilitate this, robotics researchers need advanced simulation platforms offering dynamic, photorealistic environments with realistic actors. Unfortunately, most existing simulators fall short, prioritizing geometric accuracy over visual fidelity, and employing unrealistic agents with fixed trajectories and low-quality visuals. To overcome these limitations, we developed a simulator that incorporates three essential elements: (1) photorealistic neural rendering of environments, (2) neurally animated human entities with behavior management, and (3) an ego-centric robotic agent providing multi-sensor output. By utilizing advanced neural rendering techniques in a dual-NeRF simulator, our system produces high-fidelity, photorealistic renderings of both environments and human entities. Additionally, it integrates a state-of-the-art Social Force Model to model dynamic human-human and human-robot interactions, creating the first photorealistic and accessible human-robot simulation system powered by neural rendering.
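The Social Force Model the simulator integrates is the classic formulation of Helbing and Molnár: each agent is driven toward its goal velocity and repelled exponentially by nearby agents. A minimal sketch with textbook-style parameter values, not the simulator's implementation:

```python
# Basic social force model; parameters v0, tau, A, B are assumed values.
import numpy as np

def social_force(pos, vel, goal, others, v0=1.3, tau=0.5, A=2.0, B=0.3):
    """Resultant 2D force on one agent: goal attraction plus repulsion."""
    desired_dir = (goal - pos) / np.linalg.norm(goal - pos)
    f_goal = (v0 * desired_dir - vel) / tau        # relax toward goal speed
    f_rep = np.zeros(2)
    for other in others:                           # repulsion from each agent
        diff = pos - other
        dist = np.linalg.norm(diff)
        f_rep += A * np.exp(-dist / B) * diff / dist
    return f_goal + f_rep
```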
A Comparison of Tiny-nerf versus Spatial Representations for 3d Reconstruction
Gante, Saulo Abraham, Vasquez, Juan Irving, Valencia, Marco Antonio, Carbajal, Mauricio Olguín
Neural rendering has emerged as a powerful paradigm for synthesizing images, offering many benefits over classical rendering by using neural networks to reconstruct surfaces, represent shapes, and synthesize novel views of objects or scenes. In neural rendering, the environment is encoded into a neural network. We believe these new representations can be used to encode the scene for a mobile robot. In this work, we therefore compare a popular neural rendering method, tiny-NeRF, with volume representations commonly used as maps in robotics, such as voxel maps, point clouds, and triangular meshes. The goal is to establish the advantages and disadvantages of neural representations in a robotics context. The comparison is made in terms of spatial complexity and the processing time needed to obtain a model. Experiments show that tiny-NeRF requires three times less memory than the other representations, but takes about six times longer to compute the model.
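The reported memory saving is consistent with simple parameter counting: a tiny-NeRF-style MLP's footprint is fixed by its width and depth, while a dense voxel map grows cubically with resolution. A back-of-the-envelope comparison with assumed sizes (our illustration, not the paper's measurements):

```python
def mlp_params(width=64, depth=4, in_dim=3, out_dim=4):
    """Parameter count of a fully connected network (weights + biases)."""
    dims = [in_dim] + [width] * depth + [out_dim]
    return sum(a * b + b for a, b in zip(dims, dims[1:]))

mlp_bytes = 4 * mlp_params()            # float32 parameters
voxel_bytes = 4 * 4 * 128 ** 3          # assumed 128^3 grid, 4 floats per cell
print(f"tiny-NeRF-like MLP: {mlp_bytes / 1e6:.3f} MB")
print(f"dense voxel grid:   {voxel_bytes / 1e6:.3f} MB")
```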
Neural Rendering: A Brief Overview - weishaupt.ai
Neural rendering uses deep neural networks to create new images and video from existing scenes. Camera angles, lighting, and other details of a 3D scene can be captured in a learned model and rendered realistically. In addition, neural rendering of existing images and videos can be used to generate synthetic data. Why it matters: traditional 3D graphics rendering needs a model with a polygon mesh describing shape, color, and textures, as well as the lighting and camera position. Neural rendering simulates the camera physics to separate the 3D scene from the capture process, making it easier to create new, consistent images from existing images and videos.
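Concretely, the core operation behind most NeRF-style neural rendering is volume rendering: a network maps sampled 3D points to colour and density, which are alpha-composited along each camera ray into a pixel. A minimal sketch, where `field` stands in for any trained scene network:

```python
import numpy as np

def render_ray(field, origin, direction, t_near=0.1, t_far=4.0, n=64):
    """Alpha-composite colour/density samples along one camera ray."""
    ts = np.linspace(t_near, t_far, n)
    pts = origin + ts[:, None] * direction      # (n, 3) sample points
    rgb, sigma = field(pts)                     # colour (n, 3), density (n,)
    delta = np.diff(ts, append=ts[-1] + (ts[1] - ts[0]))
    alpha = 1.0 - np.exp(-sigma * delta)        # per-sample opacity
    trans = np.cumprod(np.append(1.0, 1.0 - alpha))[:-1]  # transmittance
    weights = trans * alpha
    return (weights[:, None] * rgb).sum(axis=0)  # final pixel colour
```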
Multi-View Mesh Reconstruction with Neural Deferred Shading
Worchel, Markus, Diaz, Rodrigo, Hu, Weiwen, Schreer, Oliver, Feldmann, Ingo, Eisert, Peter
We propose an analysis-by-synthesis method for fast multi-view 3D reconstruction of opaque objects with arbitrary materials and illumination. State-of-the-art methods use both neural surface representations and neural rendering. While flexible, neural surface representations are a significant bottleneck in optimization runtime. Instead, we represent surfaces as triangle meshes and build a differentiable rendering pipeline around triangle rasterization and neural shading. The renderer is used in a gradient descent optimization where both a triangle mesh and a neural shader are jointly optimized to reproduce the multi-view images. We evaluate our method on a public 3D reconstruction dataset and show that it can match the reconstruction accuracy of traditional baselines and neural approaches while surpassing them in optimization runtime. Additionally, we investigate the shader and find that it learns an interpretable representation of appearance, enabling applications such as 3D material editing.
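The pipeline the abstract outlines is straightforward to sketch: a differentiable rasterizer produces a G-buffer (per-pixel positions, normals, view directions), a small neural shader maps the G-buffer to colours, and mesh vertices and shader weights are optimised jointly against the captured images. The following is an illustrative PyTorch skeleton, with `rasterize` and `views.sample` as placeholder APIs and a plain L1 photometric loss; it is not the authors' code:

```python
import torch
import torch.nn as nn

# Small neural shader: per-pixel position, normal, view direction -> RGB.
shader = nn.Sequential(nn.Linear(9, 64), nn.ReLU(), nn.Linear(64, 3))

def render(vertices, faces, camera, rasterize):
    position, normal, view_dir = rasterize(vertices, faces, camera)  # (H, W, 3) each
    g_buffer = torch.cat([position, normal, view_dir], dim=-1)
    return shader(g_buffer)                     # (H, W, 3) shaded image

def fit(vertices, faces, views, rasterize, steps=2000):
    """Jointly optimise mesh vertices and shader against captured views."""
    vertices = vertices.clone().requires_grad_(True)
    opt = torch.optim.Adam([vertices, *shader.parameters()], lr=1e-3)
    for _ in range(steps):
        camera, target = views.sample()         # one captured view (placeholder)
        loss = (render(vertices, faces, camera, rasterize) - target).abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return vertices
```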
AI lets you edit people talking in videos by adding or deleting words from a transcript
Actors who flub dialogue may become less of a bane to producers thanks to an AI that lets their lines be edited simply by retyping a transcript. The software combines existing clips with digital face models to create new footage that is lip-synced to match the desired edits. The technology could be abused to create creepy deepfake videos that make people appear to say things they never did. However, the researchers say these risks are more than outweighed by the benefits of the program, which could also cheaply localise or translate content. At present, the technique only works on videos of a forward-facing interview that are of sufficient length: 40 minutes or over.